Parallelising Harvesting
Abstract
Metadata harvesting has become a common technique for transferring a stream of data from one metadata repository or digital library system to another. As collections of metadata, and their associated digital objects, grow in size, the ingest of these items at the destination archive can take a significant amount of time, depending on the type of indexing or post-processing that is required. This paper discusses an approach to parallelise the post-processing of data in a small cluster of machines or a multi-processor environment, while not increasing the burden on the source data provider. Performance tests were carried out on varying architectures, and the results indicate that this technique is promising for some scenarios and can be extended to more computationally intensive ingest procedures. In general, the technique presents a new approach for the construction of harvest-based distributed or component-based digital libraries, with better scalability than before.
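The abstract does not describe the paper's actual design, so the following is only a minimal sketch of the general idea, under stated assumptions: a single sequential harvester issues one request per batch (so the load on the source is the same as in a serial harvest), while a pool of workers post-processes records in parallel. The names `harvest`, `post_process` and `ingest` are hypothetical, and the harvest itself is simulated with in-memory batches rather than a real repository.

```python
from concurrent.futures import ThreadPoolExecutor


def harvest(batches):
    """Hypothetical stand-in for a sequential harvest: yields one batch per request."""
    for batch in batches:
        yield batch


def post_process(record):
    """Hypothetical stand-in for indexing or other per-record ingest work."""
    return record.upper()


def ingest(batches, workers=4):
    """Harvest sequentially, but post-process records on a worker pool.

    Returns the number of requests made to the source and the processed
    records in harvest order.
    """
    requests_made = 0
    futures = []
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for batch in harvest(batches):
            # Exactly one request per batch, regardless of the worker count.
            requests_made += 1
            # Hand records to the pool and keep harvesting without waiting.
            futures.extend(pool.submit(post_process, record) for record in batch)
        # Collect results in submission order once harvesting is done.
        results = [future.result() for future in futures]
    return requests_made, results


if __name__ == "__main__":
    print(ingest([["a", "b"], ["c"]], workers=2))
```

A thread pool is used here only to keep the sketch simple. For CPU-bound post-processing in Python, `concurrent.futures.ProcessPoolExecutor` would be the usual substitute, and a small cluster would need a work queue shared between machines instead.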
Similar resources
Loop parallelisation using a simple parallelising compiler with the RHODOS distributed system
The procedure of parallelising a sequential task can be very difficult and even with considerable experience in the area of parallelisation, it can still be quite monotonous. Automation of parallelisation through using a parallelising compiler is often an advantageous path to pursue due to the savings in both time and effort. A major component of automatic parallelisation is loop parallelisatio...
Towards a Parallelising COBOL Compiler
This paper briefly describes some of the fundamental issues in developing an automatic parallelising COBOL compiler. The rationale for such a tool and its connection with research in scientific computing is described. This is followed by an account of the useful forms of parallelism to be found in a COBOL program and how they may be detected using dependence analysis. Finally, transformations t...
Calculating likely Parallelism within Dependant Conjunctions for Logic Programs
The rate at which computers are becoming faster at sequential execution has dropped significantly. Instead parallel processing power is increasing, and multicore computers are becoming more common. Automatically parallelising programs is becoming much more desirable. Parallelising programs written in imperative programming languages is difficult and often leads to unreliable software. Paralleli...
Towards a skeleton based parallelising compiler for SML
A design for a skeleton based parallelising compiler for a pure functional subset of Standard ML is presented. The compiler will use Structural Operational Semantics based prototype instrumentation to determine communication and processing costs at sites of higher order function use. Useful parallelism will be identified through performance models for equivalent algorithmic skeletons. Parallelis...
Experiences in Parallelising an Aeronautics Code on the KSR1
Virtual Shared Memory (VSM) has been proposed as the solution to scalable shared memory parallel architectures. This paper reports on parallelising a scientific code from aeronautical engineering to a VSM machine, the KSR1. The code predicts the laminar to turbulent transition point of flow over an aerofoil. The experiences of initial porting and successive optimisation to examine efficiency on...